Abstract

In this article we discuss the formal structure of a generalized information theory based on the 
extension of the probability calculus of Kolmogorov to a (possibly) non-commutative setting. By 
studying this framework, we argue that quantum information can be considered as a particular 
case of a huge family of non-commutative extensions of its classical counterpart. In any conceivable 
information theory, the possibility of dealing with different kinds of information measures plays a 
key role. Here, we generalize a notion of state spectrum, allowing us to introduce a majorization 
relation and a new family of generalized entropic measures. 
I. INTRODUCTION


Quantum Information Theory is interesting not only because of its promising technological
applications, but also because of its impact at the very heart of physics, giving rise to a new
way of studying quantum mechanics [1] and other possible theories as well. In particular, it
has prompted a quest for the foundational principles that single out quantum mechanics
among a vast family of possible statistical theories UM- The study and characterization
of quantum correlations plays a central role in this quest [6], entanglement [7-9] and
discord uni being the most important ones. As is well known, probabilities and correlations are
essential concepts in both classical and quantum information theories. But it turns out that
the probabilities involved are fundamentally different in each of these theories. In this work,
we will argue that, due to quantum contextuality and the non-Kolmogorovian nature of the
underlying probabilities, quantum information theory can be correctly characterized as a
non-commutative version of its classical counterpart.

For a statistical theory, it is very illuminating to look at the geometrical aspects of the
set of possible states. This has been done extensively for the quantum case diHin]. But the
geometry of a quantum set of states differs radically from that of a classical one. While the
sets of states of classical and quantum systems share the characteristic of being convex,
classical models, unlike quantum ones, are simplexes. This difference expresses itself also at
the level of the axiomatization. While Kolmogorov’s axioms suffice for describing classical
probabilistic models, the Boolean structure of a sigma algebra must be generalized to an
orthomodular lattice of projection operators for the quantal case m ng EDHao].

One may wonder if there are probabilistic models more general than the quantum and classical
ones. This is indeed the case, and we need not go far from standard quantum mechanics
in order to find them. For example, in algebraic relativistic quantum field theory, states
may be defined as measures over Type III factors [21], a special kind of von Neumann
algebra [521137| which differs from the Type I factors appearing in standard quantum
mechanics liaEHHin]. Type II factors can also be found in algebraic statistical mechanics
(quantum mechanics with infinitely many degrees of freedom) [381 ITT] . Thus, it is clear that
states defining measures which go beyond standard quantum mechanics exist, and they appear in
examples of interest for physics. These new measures, which go beyond the distributive (or
equivalently, commutative) case of the Boolean sigma algebra, are sometimes called
non-Kolmogorovian or non-commutative probabilities. The fact that these non-commutative
probabilities are involved is responsible for the emergence of the peculiar features of quantum
information theory.

But then, one could also imagine a setting where more general probabilistic theories can
be conceived, in order to study their general features and compare them with already known
ones. This approach has been developed by many authors, and it is fair to say that it is
based on the study of convex sets of states which define measures on certain algebras of
observables, whose elements are usually called events or, more generally, effects (see for
example [52]). The origins of this approach can be traced to the works of G. Ludwig II211I3] and
G. Mackey [53], but also to von Neumann. See also [38], [53-56] for other axiomatizations
of non-Kolmogorovian probabilities and their relationships with lattice theory. Non-linear
generalizations of quantum mechanics were studied using a similar approach in [16]-[18].

It is possible to study many important notions of information theory, such as entanglement,
discord, and many information protocols, in generalized probabilistic models (see for
example [57]). We will argue in favour of the existence of a generalized information
theory, continuing the lines of previous works [48], [49], [58] (see also [59] and [60], where
non-commutative versions of many statistical techniques are studied). By focusing on the
study of the formal aspects of the probabilities involved in different models, we show that
the non-Kolmogorovian character of the probabilities underlying the quantum formalism is
responsible for the emergence of quantum information theory [127] . This allows us to claim:
Kolmogorovian probabilities imply Shannon’s information theory; the non-commutative
probability calculus of quantum theory implies quantum information theory. Quantum and
classical information theories appear as particular instances of a formalism based on generalized
probabilistic measures.

Any information theory depends strongly on our capability of dealing with different
information measures. This is the case in classical information theory, where Shannon’s [M],
Tsallis [62] and Rényi [63] entropies (among other measures) are used for different
purposes. A similar diversity of measures should be available in the generalized probabilistic
setting. Previous works have focused on some entropic measures in the setting of generalized
probabilities [48] IGTI [65] . In this paper we extend a new family of entropies based on the
$(h, \phi)$-entropies to the general probabilistic setting [55], [57] . These measures include the
previous ones studied in the literature as particular cases. Other important notions
introduced in this article are a definition of generalized spectrum for states in general models,
and a relationship of generalized majorization between states. These are shown to be useful
for defining functions of states and studying the properties of the entropic measures.

The paper is organized as follows. In Section II we review classical probabilities in
the Kolmogorov approach. Next, in Section III, we turn to important aspects of the quantum
formalism and the formal structure of probability measures in quantum mechanics.
In Section IV we discuss the formal aspects of a generalized information theory in the
operational approach. In Section V we introduce our new family of information measures
and the notion of generalized spectrum, which allows us to introduce the concept of generalized
majorization. Finally, in Section VI we draw our conclusions.


II. CLASSICAL PROBABILITIES 

One of the most widely used axiomatizations of classical probability theory is that of A. N.
Kolmogorov |6^. If the possible outcomes of an experiment are represented by a set $\Omega$,
subsets of it can be considered as representing events. It is usual to restrict events to a
$\sigma$-algebra $\Sigma$ of subsets of $\Omega$. Thus, Kolmogorov defines probability measures
as functions $\mu$ such that

$$\mu : \Sigma \to [0,1], \qquad (1a)$$

satisfying

$$\mu(\Omega) = 1, \qquad (1b)$$

and, for any pairwise disjoint denumerable family $\{A_i\}_{i \in I}$,

$$\mu\Big(\bigcup_{i \in I} A_i\Big) = \sum_{i \in I} \mu(A_i). \qquad (1c)$$

In this way, Kolmogorov’s approach puts probability theory in a direct connection with
measure theory. From these axioms it is straightforward to see that $\mu(\emptyset) = 0$ and
$\mu(A^c) = 1 - \mu(A)$, where $(\cdot)^c$ denotes the set-theoretical complement.
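
As an illustration of axioms (1a)-(1c), the following minimal sketch checks them on a finite outcome set, taking the $\sigma$-algebra to be the full power set. The fair six-sided die and the names `omega`, `weights` and `mu` are assumptions made only for this example; they do not appear in the text above.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: a fair six-sided die (an assumption of this sketch).
omega = frozenset(range(1, 7))
weights = {w: Fraction(1, 6) for w in omega}

def mu(event):
    """Probability of an event, i.e. of a subset of omega."""
    return sum((weights[w] for w in event), Fraction(0))

def powerset(s):
    """All subsets of s; for a finite set this is a sigma-algebra."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = powerset(omega)

# (1a): every event receives a value in [0, 1].
axiom_1a = all(0 <= mu(a) <= 1 for a in events)
# (1b): normalization.
axiom_1b = mu(omega) == 1
# (1c): additivity, checked here on pairs of disjoint events only.
axiom_1c = all(mu(a | b) == mu(a) + mu(b)
               for a in events for b in events if not (a & b))
# Consequences quoted in the text: mu(empty set) = 0 and mu(A^c) = 1 - mu(A).
empty_is_zero = mu(frozenset()) == 0
complement_rule = all(mu(omega - a) == 1 - mu(a) for a in events)
```

Exact rational arithmetic (`Fraction`) is used so that the equalities can be tested without rounding tolerances.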

There exist many approaches to classical probabilities (see, e.g., [69] for a complete review).
This subject is too vast to cover here and goes beyond the scope of this work. We only
mention the Bayesian school because of its importance and many physical applications
[70]-[72] (see also [73]).
III. QUANTUM PROBABILITIES


In this Section we will discuss the special features of the probabilities involved in quantum
theory. The most salient feature is that, unlike the classical case, the algebra of events of a
quantum system is non-Boolean. This is related to the complementarity principle, according to
which incompatible experiments are needed to fully describe quantum phenomena.

A. Elementary tests in quantum mechanics 

Propositions such as “the value of the energy lies in the interval (a, b)” or “the particle
is detected within the interval (a, b)” are examples of how results of experiments can be
expressed in quantum mechanics. Elementary propositions of that form are usually called
events, and they are represented by projection operators as follows. A projection-valued
measure (PVM) is a map $M$ such that

$$M : B(\mathbb{R}) \to \mathcal{P}(\mathcal{H}), \qquad (2a)$$

where $B(\mathbb{R})$ denotes the Borel sets of $\mathbb{R}$ and $\mathcal{P}(\mathcal{H})$ is the
set of projections on a Hilbert space $\mathcal{H}$, satisfying

$$M(\emptyset) = 0 \quad \text{and} \quad M(\mathbb{R}) = 1, \qquad (2b)$$

where $0$ is the null space and $1$ the identity operator, and

$$M\Big(\bigcup_{i \in I} B_i\Big) = \sum_{i \in I} M(B_i) \qquad (2c)$$

for any disjoint denumerable family $\{B_i\}_{i \in I}$. As in the classical case, it follows from
these axioms that $M(B^c) = 1 - M(B) = M(B)^{\perp}$ (where $(\cdot)^{\perp}$ stands for orthogonal
complement).

The spectral theorem allows us to assign a PVM to any selfadjoint operator representing a
physical observable $O$ [SUES] . We denote by $M_O$ its corresponding PVM. Thus, for any
interval $(a, b) \subseteq \mathbb{R}$ of possible values of $O$, $M_O((a, b)) = P_{(a,b)}$ is a
projection operator that represents the elementary event "the value of $O$ lies in the interval
$(a, b)$".

The state of a quantum mechanical system is represented by a density operator $\rho$, which
is positive semi-definite and of trace one m- Given $\rho$, the probability that the event
represented by $P_{(a,b)}$ occurs is given by Born’s rule

$$p(P_{(a,b)}; \rho) = \mathrm{Tr}(\rho P_{(a,b)}). \qquad (3)$$

A generalization of the above mechanism for computing probabilities is given by the notion
of quantal effects and positive operator valued measures (POVMs) [TSl - IST] . In quantum
mechanics, a POVM is represented by a mapping

$$E : B(\mathbb{R}) \to \mathcal{B}(\mathcal{H}), \qquad (4a)$$

where $\mathcal{B}(\mathcal{H})$ stands for the bounded operators on $\mathcal{H}$, such that

$$E(\mathbb{R}) = 1, \qquad (4b)$$

$$E(B) \geq 0, \quad \text{for all } B \in B(\mathbb{R}), \qquad (4c)$$

$$E\Big(\bigcup_{i \in I} B_i\Big) = \sum_{i \in I} E(B_i), \quad \text{for any disjoint family } \{B_i\}_{i \in I}. \qquad (4d)$$

Then, the probability of effect $E$ given that the system is prepared in state $\rho$ is given by

$$P(E; \rho) = \mathrm{Tr}(\rho E). \qquad (5)$$
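
A minimal numerical sketch of Eqs. (3) and (5) for a qubit is given below. The particular density matrix, the two-outcome projective measurement and the noisy two-outcome POVM are illustrative assumptions, as are the helper names `matmul` and `prob`.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def prob(rho, effect):
    """Tr(rho * effect): probability of `effect` when the state is `rho`."""
    product = matmul(rho, effect)
    return (product[0][0] + product[1][1]).real

# Assumed state: trace one and positive semi-definite (determinant 0.125 > 0).
rho = [[0.75, 0.25], [0.25, 0.25]]

# A PVM: two orthogonal projections summing to the identity.
P_up = [[1, 0], [0, 0]]
P_down = [[0, 0], [0, 1]]

# A POVM that is not projective: a noisy version of the PVM above.
# E0 and E1 are positive and sum to the identity, but E0 * E0 != E0.
E0 = [[0.9, 0], [0, 0.1]]
E1 = [[0.1, 0], [0, 0.9]]

pvm_probs = [prob(rho, P_up), prob(rho, P_down)]   # Born's rule, Eq. (3)
povm_probs = [prob(rho, E0), prob(rho, E1)]        # Eq. (5)
```

In both cases the probabilities add up to one because the effects add up to the identity and the state has unit trace.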


B. Von Neumann’s axioms 

Is there an analogue of Kolmogorov’s axioms in quantum theory? As we have seen,
events of a classical probabilistic theory can be represented as subsets of a given outcome set,
yielding a Boolean $\sigma$-algebra. Consequently, classical states can be considered as measures
over Boolean algebras. But, as mentioned above, the complementarity principle forces the
non-commutativity of certain observables. This makes the algebra of projection operators (i.e.,
the algebra of possible events) non-distributive, and thus, non-Boolean. In this way, quantum
states can be characterized as measures over non-Boolean algebras as follows:

$$s : \mathcal{P}(\mathcal{H}) \to [0,1], \qquad (6a)$$

such that

$$s(1) = 1, \qquad (6b)$$

and, for a denumerable and pairwise orthogonal family of projections $\{P_i\}_{i \in I}$,

$$s\Big(\sum_{i \in I} P_i\Big) = \sum_{i \in I} s(P_i). \qquad (6c)$$
We will refer to the above axioms as von Neumann’s axioms. Gleason’s theorem [82] asserts
that (for Hilbert spaces of dimension greater than two) the family of measures obeying von
Neumann’s axioms is in bijective correspondence with the set of positive trace class operators
of trace one, which is nothing but the set of all possible quantum states. Thus, von Neumann’s
axioms relate quantum states with the non-Boolean (or non-commutative) measure theory defined
by Eqs. (6a)-(6c). As remarked in the introduction, this fact lies behind the distinctive
features of quantum information theory.
Another important remark is that both the collection of all possible measures obeying von
Neumann’s axioms and the ones obeying Kolmogorov’s form convex sets. This geometrical
feature can be endowed with a natural physical interpretation: given two probability
distributions, one can always form a mixture of them (and this will be represented mathematically
by the corresponding convex combination in the state space).


C. Quantum Correlations 

The non-abelian character of the quantum algebraic setting gives rise to a variety of
new possibilities regarding correlations. So far, the most important of these novel quantum
features has been the so-called entanglement. First recognized by Schrödinger and by Einstein,
Podolsky and Rosen in 1935, entanglement has remained at the centre of debate, inspiring
discussions around the completeness of the formalism, the reality and locality of the theory,
or, more recently, about its status as a resource for quantum information processing tasks
(see, e.g., |H3] for a complete review).

In the bipartite scenario, a quantum state is said to be non-entangled if and only if it can be
approximated by convex linear combinations of product states. As Werner put it in his
1989 seminal paper, a joint AB-bipartite state $\rho$ is separable if there exist a
probability distribution $\{p_k\}$ and marginal states $\{\rho_k^A\}$, $\{\rho_k^B\}$ such that |9[ IM]

$$\rho = \sum_k p_k \, \rho_k^A \otimes \rho_k^B. \qquad (7)$$

Then, $\rho$ is entangled if it is not separable. This definition can be rephrased in more general
algebraic terms. Let $\mathcal{N}_A$ and $\mathcal{N}_B$ be von Neumann algebras acting on a common
Hilbert space, associated to the A and B subsystems. A state $\omega_\rho : \mathcal{N} \to \mathbb{C}$
is an expectation value functional, where $\omega_\rho(n) = \mathrm{Tr}(n\rho)$ for any observable
$n \in \mathcal{N}$. Then, $\omega_\rho$ on $\mathcal{N}_A \vee \mathcal{N}_B$ (the smallest von Neumann
algebra generated by $\mathcal{N}_A$ and $\mathcal{N}_B$) is a product state with respect
to $\mathcal{N}_A$ and $\mathcal{N}_B$ iff $\omega_\rho(ab) = \omega_\rho(a)\omega_\rho(b)$ for any
$a \in \mathcal{N}_A$ and any $b \in \mathcal{N}_B$. If $a$ and $b$ are projectors, the fact that
$\omega_\rho$ is a product state implies that the probability of measuring $ab$ factorizes, which is
the usual criterion for uncorrelation. Moreover, the state $\omega_\rho$ on
$\mathcal{N}_A \vee \mathcal{N}_B$ is separable with respect to $\mathcal{N}_A$ and $\mathcal{N}_B$ iff it
can be approximated by convex linear combinations of product states. Otherwise, it is entangled.

As claimed before, the non-abelian nature of $\mathcal{N}_A$ ($\mathcal{N}_B$) is essential here. No
entanglement is possible if the algebras are generated only by commutative observables [Ml l85] . In
other words: probabilities must be non-Kolmogorovian as a condition of possibility for true
entanglement. This fact has important consequences for quantum information processing,
because entanglement plays a key role in the most useful protocols.

The non-commutativity is also responsible for the perturbation of the joint state when
measuring over one of its parts. This fact can be quantified by the difference between the
pre- and post-measurement mutual informations after a local (non-selective) measurement,
a quantity known as discord uniES]. A non-discordant or classically-correlated state $\rho$ is
one that can be written as |SZ]

$$\rho = \sum_{ij} p_{ij} \, \Pi_i^A \otimes \Pi_j^B, \qquad (8)$$

where $\{\Pi_i^A\}$ ($\{\Pi_j^B\}$) is a basis of orthogonal projectors on the Hilbert space of A (B),
and $\{p_{ij}\}$ is the corresponding probability distribution. Also, one can define states that are
classically-correlated with respect to one of the parts only. For example,
$\rho = \sum_{ij} p_{ij} \, \Pi_i^A \otimes \rho_j^B$
would be a classical-quantum state. Regarding its accessible information, a classical-quantum
state can be locally measured in A to obtain maximal information about the
joint state without perturbing that state. In the last decade, quantum discord was also
identified with the quantum advantage for some informational tasks (see [88] for a complete
review). Notice that in order to have non-null discord, non-orthogonal (i.e., incompatible)
projections must be involved: this is another way in which the non-Boolean character of the
event algebra is expressed.

As we explain below, the notions of entanglement and discord can be extended to general
probabilistic theories.

Finally, it is worth noting that there are many other ways to assess the quantum
peculiarities. For example, steering (first proposed by Schrödinger [7], and which has recently
attracted a lot of attention [89-93]) concerns the perturbation of a distant part through the
manipulation of local degrees of freedom, and is closely related to the notion of non-locality.


IV. GENERALIZED SETTING 

The lattice of projection operators of a separable Hilbert space and $\sigma$-algebras
are special instances of orthomodular lattices [2H]- Orthomodular lattices are a suitable
framework for describing contextual theories: given an orthomodular lattice $\mathcal{L}$, each
possible context will be represented by a maximal Boolean subalgebra. If the maximal Boolean
subalgebra coincides with the original lattice, then the theory will be non-contextual. In
order to describe theories more general than quantum mechanics, one could generalize the
above axioms for probability theory to arbitrary orthomodular lattices as follows. Given
$\mathcal{L}$, define a measure $\nu$ satisfying

$$\nu : \mathcal{L} \to [0,1], \qquad (9a)$$

such that

$$\nu(1) = 1, \qquad (9b)$$

and, for a denumerable and pairwise orthogonal family of events $\{E_i\}_{i \in I}$,

$$\nu\Big(\bigvee_{i \in I} E_i\Big) = \sum_{i \in I} \nu(E_i). \qquad (9c)$$

See e.g. [23] for conditions under which these measures exist. It is important to remark that
Eqs. (1a)-(1c) and (6a)-(6c) are just particular examples of the above axioms. But these are
much more general: in algebraic relativistic quantum field theory and in algebraic statistical
mechanics more general orthomodular lattices appear ISBEZIES]. Many of the informational
notions that can be described in quantum mechanics can be generalized to this formal setting
(see for example [65], |9l] and [95], where the Maximum Entropy principle is analyzed). It
is also important to mention that other types of non-Kolmogorovian probabilistic theories
can be conceived (we will not deal with them here, but see for example [96] and EH).


In Section III B we have mentioned that both quantum and classical state spaces are
convex sets. This has to do with the fact that the collection of measures over an orthomodular
lattice can always be endowed with a convex set structure (it is straightforward to show this
for measures obeying axioms (9a)-(9c)). The convex structure of the state space will play a
key role in probabilistic theories.

Is it possible to describe generalized probabilistic theories using convex sets as the
starting point? The answer is affirmative (see for example [16]-[18] and HIHIIIIST]). Let us denote
by $\mathcal{C}$ the set of all possible states of an arbitrary model. It is reasonable to assume that
$\mathcal{C}$ is convex, given the fact that we should be allowed to make mixtures of states. Given an
observable quantity, denote by $X$ the set of its possible measurement outcomes. Given
an arbitrary state $\nu \in \mathcal{C}$ and any outcome $x \in X$, a number $\nu(x) \in [0,1]$ should be
assigned, representing the probability of obtaining the outcome $x$ given that the system is
prepared in state $\nu$. Using this, for each outcome $x$ we can define an affine evaluation
functional $E_x : \mathcal{C} \to [0,1]$ in a canonical way by $E_x(\nu) = \nu(x)$.

As $\mathcal{C}$ is convex, it can be naturally embedded in a vector space $V(\mathcal{C})$. Thus, any affine
functional acting on $\mathcal{C}$ belongs to a dual space $V^*(\mathcal{C})$. It is very natural then to
consider any affine functional $E : \mathcal{C} \to [0,1]$ as representing a possible measurement
outcome or generalized effect and, as above, to interpret $E(\nu)$ as the probability of finding the
outcome represented by effect $E$ if the system is prepared in state $\nu$. It is very natural also to
assume that there exists a normalization functional $u_{\mathcal{C}}$ such that $u_{\mathcal{C}}(\nu) = 1$
for all $\nu \in \mathcal{C}$ (in the quantum case, this functional is represented by the trace
functional). A (discrete) observable will then be represented by a set of effects $\{E_i\}$ such that
$\sum_i E_i = u_{\mathcal{C}}$.

$\mathcal{C}$ will be said to be finite dimensional if and only if $V(\mathcal{C})$ is finite dimensional.
In this paper, we will restrict ourselves for simplicity to this case and to compact sets of states.
These conditions imply that $\mathcal{C}$ can be expressed as the convex hull of its extreme points. As
in the quantum and classical cases, extreme points of the convex set of states will represent pure
states.

Define a finite dimensional simplex as the convex hull of $d+1$ linearly independent points.
A system is said to be classical if and only if its state space is a simplex. It is a well known fact
that in a simplex a point may be expressed as a unique convex combination of its extreme points.
This characteristic feature of classical theories no longer holds in quantum models. Indeed,
in the case of quantum mechanics, there are infinitely many ways in which one can express a mixed
state as a convex combination of pure states (for a graphical representation, think about the
maximally mixed state in the Bloch sphere).
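
The non-uniqueness of convex decompositions in the quantum case can be checked numerically. The sketch below, which assumes real-valued qubit states and uses the hypothetical helpers `projector`, `mixture` and `close`, builds the maximally mixed state from two different pairs of pure states (eigenstates of the spin along z and along x).

```python
import math

def projector(v):
    """Pure state |v><v| for a normalized real vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def mixture(weights, states):
    """Convex combination sum_k w_k * state_k of 2x2 matrices."""
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)

# Equal mixture of |0> and |1>.
z_mix = mixture([0.5, 0.5], [projector((1, 0)), projector((0, 1))])
# Equal mixture of |+> and |->, a different pair of extreme points.
x_mix = mixture([0.5, 0.5], [projector((s, s)), projector((s, -s))])

# Both mixtures give the same point of the state space: identity / 2.
same_state = close(z_mix, x_mix)
```

In a simplex the analogous experiment fails: two different sets of weights over the extreme points always give two different points.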
Interestingly enough, there is also a connection between the faces of the convex set of
states of a given model and its lattice of properties (in the quantum-logical sense), providing
an unexpected connection between geometry, lattice theory and statistical theories mm
[98]. A subset $F$ of a convex set is a face if, for all $x$ satisfying

$$x = \lambda x_1 + (1 - \lambda) x_2, \quad 0 < \lambda < 1, \qquad (10)$$

$x \in F$ if and only if $x_1 \in F$ and $x_2 \in F$ m Thus, faces of a convex set can be
interpreted geometrically as subsets which are stable under mixing and purification. It is
possible to show that the set of faces of any convex set can be endowed with a lattice structure
in a canonical way. For a classical model (i.e., one described by a simplex) it turns out that the
lattice is Boolean. Thus, probabilities defined by classical state spaces are Kolmogorovian.
On the other hand, in QM, the lattice of faces of the convex set of states (defined as the set
of positive trace class hermitian operators of trace one) is isomorphic to the von Neumann
lattice of closed subspaces $\mathcal{P}(\mathcal{H})$ fTTl E3] . This is nothing but saying that quantum
states obey von Neumann’s axioms. In this way, a clear connection can be made between the
approach based on orthomodular lattices and the approach based on convex sets. A similar
result holds for more general (but not all) state spaces, but we will not deal with this problem
here (see j2^ and m for more discussion on this subject).

It is very important to remark that general probabilistic models will, in general, fail to be
Kolmogorovian. This has important consequences for the possible correlations that can be
defined between different systems, and thus, for information theoretical purposes.

We finally mention an important remark about the different degrees of generality that
can be attained using different frameworks. It is very reasonable to start with measures over
orthomodular lattices, mainly because this framework includes an important family of
physical examples (such as classical statistical theories, quantum mechanics, quantum statistics
and relativistic quantum field theory), but also because it allows us to represent
complementarity in a very direct way. But more general models of interest can be constructed. For
example, $\sigma$-orthomodular posets can be used as event algebras (by defining measures
similarly to those defined by axioms (9a)-(9c)). All orthomodular lattices are $\sigma$-orthomodular
posets, but the latter are more general, because they can fail to be lattices [55]. Finally, the
approach that uses convex sets as a starting point is more general than the one provided
by orthomodular lattices (this is so because it is possible to find models for which no
orthocomplementarity relation can be defined [23], and thus their lattice of faces fails to be
orthomodular). Notwithstanding, in order to illustrate the most salient features of
non-Kolmogorovian probabilistic models, it is sometimes sufficient to stay in the orthomodular
lattices setting. This is what we will do mostly in this paper (but we will consider some
more general examples in Section 0).

A. Non-Kolmogorovian Information Theory and Contextuality 

Complementarity and contextuality [MmT] are salient features of quantum theory. The
role of the complementarity principle in quantum information theory was discussed in [102],
where it is shown that it is crucial for understanding the main features of quantum
information protocols. One of the most important formal expressions of the complementarity
principle is the non-commutativity of operators representing physical observables,
and this is intimately connected with the non-Boolean structure of the lattice of projection
operators. Furthermore, the success of the most important quantum computation algorithms
is explained in the light of the projective geometry underlying the formalism of quantum
theory in nng.

To see how this contextual structure reappears in a more general setting, consider an
orthomodular lattice $\mathcal{L}$. A maximal Boolean subalgebra is a subset $B \subseteq \mathcal{L}$
such that: 1) $B$ is closed and is a Boolean algebra with respect to the operations inherited from
$\mathcal{L}$ (i.e., it is a Boolean subalgebra) and 2) if $B'$ is another Boolean subalgebra such that
$B \subseteq B'$, then $B = B'$ (i.e., it is maximal). The important thing for us is that maximal Boolean
subalgebras can be considered as representing particular experiments to perform on the
system. To illustrate this point, think of a spin-1/2 system. If we want to measure the spin
component along the axis $\hat{z}$, this will be represented by the operator $\hat{\sigma}_z$. Then,
this operator has associated the Boolean subalgebra
$\{0, |+\rangle\langle+|, |-\rangle\langle-|, 1\}$, representing all possible events
defined by the experiment which consists of measuring that quantity: spin up in direction
$\hat{z}$ (“$|+\rangle\langle+|$”), spin down in direction $\hat{z}$ (“$|-\rangle\langle-|$”), the
contradiction “$0$” and the tautology “$1$” (which are the analogues of “$\emptyset$” and the whole
outcome set “$\Omega$” in the classical case, respectively).
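
The Boolean character of this four-element context can be verified directly. The sketch below is only an illustration: it represents the four events as diagonal matrices in the $\hat{z}$ basis, and it relies on the fact that for commuting projections the meet is the product $PQ$, the join is $P + Q - PQ$ and the orthocomplement is $1 - P$. The names `context`, `meet`, `join` and `complement` are assumptions of this example.

```python
# The four events of the spin-z context, as 2x2 integer matrices.
ZERO = ((0, 0), (0, 0))   # contradiction
UP = ((1, 0), (0, 0))     # |+><+|
DOWN = ((0, 0), (0, 1))   # |-><-|
ONE = ((1, 0), (0, 1))    # tautology

context = {ZERO, UP, DOWN, ONE}

def mul(a, b):
    """Matrix product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def meet(a, b):
    """Meet of two commuting projections: their product."""
    return mul(a, b)

def join(a, b):
    """Join of two commuting projections: a + b - ab."""
    p = mul(a, b)
    return tuple(tuple(a[i][j] + b[i][j] - p[i][j] for j in range(2))
                 for i in range(2))

def complement(a):
    """Orthocomplement 1 - a."""
    return tuple(tuple(ONE[i][j] - a[i][j] for j in range(2))
                 for i in range(2))

# Closure under the lattice operations.
closed = all(meet(a, b) in context and join(a, b) in context
             and complement(a) in context
             for a in context for b in context)

# Distributivity, the property that fails in the full projection lattice.
distributive = all(meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
                   for a in context for b in context for c in context)
```

The product formula for the meet is valid only because all elements of a single context commute; it cannot be used across incompatible contexts.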

Denote by $\mathcal{B}$ the set of all possible Boolean subalgebras of an orthomodular lattice
$\mathcal{L}$. It is possible to show that $\mathcal{L}$ can be written as the sum of its maximal Boolean
subalgebras una.

$$\mathcal{L} = \bigvee_{B \in \mathcal{B}} B. \qquad (11)$$

What is the meaning of this technical result for generalized probabilistic theories? If
$\mathcal{L}$ is Boolean, the result is trivial: the system can be described by using a single probability
distribution over a single experimental setup. If it is not Boolean, it means that the event
algebra of our theory may present mutually complementary contexts. In other words, we
will need to perform incompatible experiments (each one represented by a maximal Boolean
subalgebra) in order to fully describe phenomena. Notice that each generalized state $s$, when
restricted to a maximal Boolean subalgebra $B$, gives a Kolmogorovian probability measure
$s|_B$. Taken together with Eq. (11), this implies that a generalized state on a contextual
model can be considered as a collection of classical probabilities indexed by each empirical
setup. The generalized measure obeying axioms (9a)-(9c) provides a coherent pasting of
this collection of Kolmogorovian measures. In the quantum case, this role is played by the
density matrix representing the state of the system.


These features can be taken as a starting point in the convex sets approach. For example,
in [39] (see also [105] and [106]), a state $s$ is considered as a list of probability distributions:
$s = \{p(i, W)\}_{i = 0, \ldots, n-1;\; W = X_0, \ldots, X_{m-1}}$. The possible $W$’s represent a set of
fiducial measurements and the $i$’s label the outcomes of each measurement. Fiducial measurements
are sets of measurements out of which the state can be determined. To fix ideas, let us look in
detail at the qubit. In this case, each state can be specified as
$s = \{p(i, W)\}_{i = 0, 1;\; W = \hat{\sigma}_x, \hat{\sigma}_y, \hat{\sigma}_z}$.
The observables represented by $\hat{\sigma}_x$, $\hat{\sigma}_y$, $\hat{\sigma}_z$ are sufficient to
determine the state completely (i.e., they form a fiducial set). Notice that from this perspective, a
state is considered again as a collection of classical probability distributions.
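
For the qubit, the list of probability distributions can be written down explicitly. The following sketch assumes the standard Bloch-vector parametrization $\rho = (1 + \vec{r} \cdot \vec{\sigma})/2$, which is not spelled out in the text, and takes outcome $i = 0$ to be the eigenvalue $+1$; the function names and the chosen Bloch vector are hypothetical.

```python
from fractions import Fraction

AXES = ("x", "y", "z")

def state_table(bloch):
    """List of distributions p(i, W) for W in {sigma_x, sigma_y, sigma_z}.

    Assumes rho = (1 + r . sigma) / 2, so that outcome i = 0 (eigenvalue +1)
    has probability (1 + r_W) / 2 and outcome i = 1 has (1 - r_W) / 2.
    """
    half = Fraction(1, 2)
    return {axis: (half * (1 + r), half * (1 - r))
            for axis, r in zip(AXES, bloch)}

def bloch_from_table(table):
    """Recover the Bloch vector, hence the state, from the fiducial table."""
    return tuple(table[axis][0] - table[axis][1] for axis in AXES)

# Assumed example: a pure state with Bloch vector (3/5, 0, -4/5).
bloch = (Fraction(3, 5), Fraction(0), Fraction(-4, 5))
table = state_table(bloch)
recovered = bloch_from_table(table)
```

Each entry of `table` is an ordinary (Kolmogorovian) distribution over two outcomes, and the three of them together determine the state, which is what makes the set of measurements fiducial.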


Non-Kolmogorovian probabilities are a condition of possibility for a departure from
Shannon’s classical information theory. This can be understood in a simple way following a
generalization of the R. T. Cox approach to probability theory as follows:


• R. T. Cox [7D1 ITT] showed that if a rational agent is confronted with a Boolean algebra
representing empirical events, then any function measuring his degree of belief in the
occurrence of any event must be equivalent to a classical probability calculus. In |in7] .
it is shown that if a rational agent is confronted with a non-distributive algebra of
physical events, then the consistent probabilities must be those of Equations

• In a similar way, in the Cox approach it is shown that Shannon’s entropy is the
most natural information measure for an agent confronted with a Boolean algebra of
events. But in [5H], it is shown that if the algebra is replaced by a non-Boolean one,
then von Neumann and Measurement entropies must be used. In other words: if the
event algebras are non-Boolean, then probabilities must be non-Kolmogorovian, and
information measures depart from Shannon’s entropy and more general classical ones
(we will discuss the specific form of this departure in Section V).

This is expressed clearly in the formal structure of classical and quantum information
theories as follows. In Shannon’s theory, a source emits different messages $x$ of an outcome
set $X$ with probabilities $p_x$: this means that the probabilities involved are nothing but a
Kolmogorovian measure over the Boolean algebra generated by the possible outcomes of
the source. This implies that Shannon’s entropy will play a key role in the formalism. For
example, in the noiseless-channel coding theorem, the value of Shannon’s entropy of
the source $H(X)$ measures the optimal compression for the source messages [108]. What
changes in the quantum setting? Due to the fact that the final output of the source is now
represented by a density matrix $\rho = \sum_x p_x \rho_x$ (i.e., by a non-Kolmogorovian measure),
von Neumann’s entropy comes into play. This is expressed, for example, in Schumacher’s
quantum coding theorem, in which the optimal bound for coding is expressed in terms of
this quantity [109], [110].

The role of the non-Kolmogorovian probability involved in the quantum state emitted by 
the source is also expressed in the existence of the Holevo bound: the mutual information 
between emitter and receiver will be bounded from above by a quantity depending on the 
von Neumann’s entropy S{p) 


HX:Y)i,S(p)-Y,P.S(p.) ( 12 ) 

where $I(X : Y)$ represents the classical mutual information between random variables $X$
and $Y$. The above bound means that there is an intrinsic limit to the information accessible
to the receiver. For example, it can be shown that if the original mixture is formed by
non-orthogonal states, the Holevo bound implies that $I(X : Y)$ is strictly less than $H(X)$ (the
Shannon measure of the source), and then it is impossible for the receiver to determine
$X$ perfectly if he measures the observable $Y$ mu. This implies that if the states prepared
by the emitter are non-orthogonal, it will not be possible for the receiver to determine the
emitted state with certainty. This impossibility is directly related to the complementarity
principle, and thus to the non-Kolmogorovian character of the emitted quantum state.
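As a rough numerical illustration of the Holevo bound (12), the following sketch evaluates its right-hand side for a hypothetical qubit source that emits one of two pure states with equal probability. The ensemble, the angle `theta` and the function names are assumptions made only for this example; they do not come from the references. Because the signal states are pure, the bound reduces to the von Neumann entropy of the average state, whose eigenvalues are obtained in closed form from its trace and determinant.

```python
import math


def binary_entropy(lam: float) -> float:
    """Shannon entropy, in bits, of the two-outcome distribution (lam, 1 - lam)."""
    return -sum(t * math.log2(t) for t in (lam, 1.0 - lam) if t > 0.0)


def holevo_two_pure_states(theta: float, p: float = 0.5) -> float:
    """Right-hand side of Eq. (12) for a two-state qubit ensemble.

    The (hypothetical) source emits |0> with probability p and
    cos(theta)|0> + sin(theta)|1> with probability 1 - p. Both signal states
    are pure, so S(rho_x) = 0 and the bound equals S(rho).
    """
    overlap_sq = math.cos(theta) ** 2  # |<0|psi>|^2
    # rho has unit trace and determinant p (1 - p) (1 - overlap_sq), hence
    # eigenvalues (1 +/- sqrt(1 - 4 det)) / 2.
    det = p * (1.0 - p) * (1.0 - overlap_sq)
    lam = (1.0 + math.sqrt(max(0.0, 1.0 - 4.0 * det))) / 2.0
    return binary_entropy(lam)


source_entropy = binary_entropy(0.5)                   # H(X) for a fair binary source
chi_orthogonal = holevo_two_pure_states(math.pi / 2)   # orthogonal signal states
chi_overlapping = holevo_two_pure_states(math.pi / 4)  # non-orthogonal signal states
print(source_entropy, chi_orthogonal, chi_overlapping)
```

For orthogonal signal states the bound coincides with $H(X)$, whereas for overlapping states it falls strictly below it, which is the situation described in the paragraph above.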

B. Communication And Correlations In the Generalized Setting 

Communication is a central aspect of any possible kind of information theory. But 
communication involves more than one party: a message (or something) must be sent from 
one party to another. This is why the study of correlations is so important in order to 
account for the probabilistic aspects of a source. In order to show that informational notions 
can be studied in the general setting described above, a suitable description of multipartite 
states and correlations is needed. This has indeed been done quite extensively mau,
and many notions essential to quantum information processing (such as entanglement,
no-cloning, no-broadcasting and teleportation) can be generalized and studied in arbitrary
statistical models. A departure from classical information theory will be found in state spaces
for which non-classical probabilities and correlations are involved, and we will review how
this is directly related to the non-Kolmogorovian structure of the state space.

Let us consider a compound system, formed of parties $A$ and $B$, with state spaces
$\mathcal{C}_A$ and $\mathcal{C}_B$ respectively. The joint system will also have a state space, which we
denote by $\mathcal{C}_A \otimes \mathcal{C}_B$ (the meaning of this notation is clarified below). In order to
study its mathematical features, let us suppose that $\mathcal{C}_A \otimes \mathcal{C}_B$ can be included in the
linear span of $V(\mathcal{C}_A) \otimes V(\mathcal{C}_B)$ (this assumption is discussed in [46]). Consider the set
which contains all bilinear functionals $\varphi : V(\mathcal{C}_A) \times V(\mathcal{C}_B) \to \mathbb{R}$ satisfying
$\varphi(E, E') \geq 0$ for all effects $E$ and $E'$ and $\varphi(u_A, u_B) = 1$.
It is very reasonable, then, to call this set a maximal tensor product state space for
$A$ and $B$, denoted $\mathcal{C}_A \otimes_{\max} \mathcal{C}_B$. It has the property of being the biggest set of states
in $(V(\mathcal{C}_A) \otimes V(\mathcal{C}_B))^*$ which assigns probabilities to all product measurements.

Analogously, a minimal tensor product state space $\mathcal{C}_A \otimes_{\min} \mathcal{C}_B$ can be defined as the
convex hull of all product states. This will be the analogue of the convex set of separable
states in quantum mechanics (see [52] for more discussion on this). We will write a product
state as $\nu_A \otimes \nu_B$, satisfying

$$ \nu_A \otimes \nu_B (E, E') = \nu_A(E)\, \nu_B(E') \tag{13} $$

for all pairs $(E, E') \in V^*(\mathcal{C}_A) \times V^*(\mathcal{C}_B)$. Given these two extreme possibilities (maximal
and minimal tensor product state spaces), the set of states $\mathcal{C}_A \otimes \mathcal{C}_B$ of an actual model lies
somewhere “in between”:

$$ \mathcal{C}_A \otimes_{\min} \mathcal{C}_B \subseteq \mathcal{C}_A \otimes \mathcal{C}_B \subseteq \mathcal{C}_A \otimes_{\max} \mathcal{C}_B \tag{14} $$


For classical compound systems (for which state spaces are simplices representing
Kolmogorovian probabilities), the set of compound states equals the minimal tensor product
(and is again a classical state space). This means that if both subsystems are classical, we
recover the equality $\mathcal{C}_A \otimes_{\min} \mathcal{C}_B = \mathcal{C}_A \otimes_{\max} \mathcal{C}_B$. It can be shown that for quantum
mechanics we have the strict inclusions
$\mathcal{C}_A \otimes_{\min} \mathcal{C}_B \subset \mathcal{C}_A \otimes \mathcal{C}_B \subset \mathcal{C}_A \otimes_{\max} \mathcal{C}_B$.

With this formal setting, it is now very natural to introduce a general definition of
separable state in an arbitrary convex operational model. This is done in an analogous way
to that of [H] (see for example milSU]):

Definition 1. A state $\nu \in \mathcal{C}_A \otimes \mathcal{C}_B$ will be called separable if there exist
$p_i \in \mathbb{R}_{\geq 0}$, $\nu_A^i \in \mathcal{C}_A$ and $\nu_B^i \in \mathcal{C}_B$ such that

$$ \nu = \sum_i p_i\, \nu_A^i \otimes \nu_B^i , \qquad \sum_i p_i = 1 . \tag{15} $$

Entangled states are thus defined as those which are not separable. It can be easily checked
that entangled states exist if and only if $\mathcal{C}_A \otimes \mathcal{C}_B$ is strictly greater than
$\mathcal{C}_A \otimes_{\min} \mathcal{C}_B$. Thus, no entangled states exist for classical theories. In that case,
non-classical correlations are not allowed, and no departure from classical information
theory will be found.
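The absence of entanglement in classical theories can be made concrete with a small sketch. It assumes a finite classical model in which a joint state is just a table of joint probabilities and the pure states are the deterministic point distributions; the helper names are hypothetical. Every joint table is then a convex mixture of products of point distributions, with the table entries as weights, which is a decomposition of the form required by Definition 1.

```python
from itertools import product


def delta(k, n):
    """Deterministic (pure) classical state concentrated on outcome k of n."""
    return [1.0 if i == k else 0.0 for i in range(n)]


def product_state(a, b):
    """Joint table of the product state a (x) b."""
    return [[x * y for y in b] for x in a]


def separable_decomposition(joint):
    """Return [(weight, nu_A, nu_B)] such that joint = sum of weight * nu_A (x) nu_B."""
    n_a, n_b = len(joint), len(joint[0])
    return [(joint[i][j], delta(i, n_a), delta(j, n_b))
            for i, j in product(range(n_a), range(n_b)) if joint[i][j] > 0.0]


def recombine(terms, n_a, n_b):
    """Rebuild the joint table from a list of weighted product states."""
    total = [[0.0] * n_b for _ in range(n_a)]
    for weight, a, b in terms:
        table = product_state(a, b)
        for i in range(n_a):
            for j in range(n_b):
                total[i][j] += weight * table[i][j]
    return total


joint = [[0.5, 0.0], [0.0, 0.5]]  # two perfectly correlated classical bits
terms = separable_decomposition(joint)
print(len(terms), recombine(terms, 2, 2) == joint)
```

Even the perfectly correlated table above is separable in this sense: the correlation is carried entirely by the mixing weights.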

It is worth noting that this generalization of entanglement, although natural, is by no
means unique, nor the most general possibility. In [112, 114] Barnum et al. propose a
subsystem-independent concept of entanglement, where the focus is on the relation between
the convex set of states and a preferred (relevant or prescribed by any means) set of effects.
Then, entanglement becomes a relative notion of purity of the states with respect to the
relevant effects (see [112, 114, 115] for details). Being independent of a certain subsystem
decomposition, this notion becomes substantially more general than the usual one, even in
the quantum scenario (see, e.g. mMni).

Regarding discord, Perinotti studied a possible introduction of the notion in general
probabilistic theories [50]. As the original definitions of quantum discord rely on the
information content of the states, and because the information measures are not uniquely
defined for general probabilistic theories, Perinotti prefers to give an operational definition
of discord. He starts by defining the set of null-discord states and proves that they can be
expressed as

$$ \nu_{nd} = \sum_{i \in I} q_i\, \nu_A^i \otimes \nu_B^i \tag{16} $$

where $\{\nu_A^i\}_{i \in I}$ is a set of jointly perfectly distinguishable pure states,
$\{\nu_B^i\}_{i \in I}$ is a set of arbitrary states of $B$, and $\{q_i\}_{i \in I}$ is a probability
distribution (see [50] for details). Then, the discord of a state $\nu$ is defined as the minimal
operational distance to the set of null-discord states $\Omega_{nd}$:

$$ \mathcal{D}(\nu) := \min_{\nu_{nd} \in \Omega_{nd}} \| \nu - \nu_{nd} \|_{op} . \tag{17} $$

The operational distance is defined through the minimum error probability in discrimination
of both states HZQj.

The fact that correlations between different parties can be studied using information
measures in the generalized setting allows one to pose the problem of communication in a
suitable mathematical form. Given that the probabilistic models involved can be
non-Kolmogorovian, the departure from Shannon’s formalism is unavoidable in most cases.


V. GENERALIZED ENTROPIES AND MAJORIZATION 

In this Section we extend the definition of classical and quantum Salicrú entropies to
the case of general probabilistic theories. In addition, we introduce definitions of spectra of
states and majorization in those theories.

A. Entropies and majorization in classical and quantum theories 

Inspired by [121], Salicrú et al. have introduced a very general expression for entropies [66],
which we call classical $(h, \phi)$-entropies, as follows.

Definition 2. For an $N$-dimensional probability vector $p = \{p_i\}$ with $p_i \geq 0$ and
$\sum_{i=1}^{N} p_i = 1$, classical $(h, \phi)$-entropies are defined as

$$ H_{(h,\phi)}(p) = h\left( \sum_{i=1}^{N} \phi(p_i) \right) \tag{18} $$

where the entropic functionals $h : \mathbb{R} \to \mathbb{R}$ and $\phi : [0,1] \to \mathbb{R}$ are continuous with
$\phi(0) = 0$ and $h(\phi(1)) = 0$, and are such that either: (i) $h$ is increasing and $\phi$ is
concave, or (ii) $h$ is decreasing and $\phi$ is convex.
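A minimal sketch of Definition 2 may help to see how familiar entropies arise from particular choices of the pair $(h, \phi)$. The function names and the use of natural logarithms are choices made for this illustration only, and the pairs shown for the Tsallis and Rényi cases are one common parametrization (others are possible).

```python
import math


def h_phi_entropy(p, h, phi):
    """Classical (h, phi)-entropy of Definition 2: h(sum_i phi(p_i))."""
    return h(sum(phi(x) for x in p))


def shannon(p):
    # h(x) = x (increasing), phi(x) = -x ln x (concave, with phi(0) = 0)
    return h_phi_entropy(p, lambda s: s,
                         lambda x: -x * math.log(x) if x > 0.0 else 0.0)


def tsallis(p, q):
    # h(x) = x, phi(x) = (x - x**q) / (q - 1); assumes q > 0 and q != 1
    return h_phi_entropy(p, lambda s: s,
                         lambda x: (x - x ** q) / (q - 1.0))


def renyi(p, alpha):
    # h(x) = ln(x) / (1 - alpha), phi(x) = x**alpha; assumes alpha > 0, alpha != 1
    return h_phi_entropy(p, lambda s: math.log(s) / (1.0 - alpha),
                         lambda x: x ** alpha)


uniform = [0.25, 0.25, 0.25, 0.25]
certain = [1.0, 0.0, 0.0, 0.0]
print(shannon(uniform), tsallis(uniform, 2.0), renyi(uniform, 2.0))
print(shannon(certain), tsallis(certain, 2.0), renyi(certain, 2.0))
```

In each case the conditions $\phi(0) = 0$ and $h(\phi(1)) = 0$ ensure that a deterministic distribution has zero entropy.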

It is straightforward to see that this definition yields the most renowned entropies, namely
the Shannon [6T], Tsallis isa and Rényi [63] ones, as particular cases. Indeed, one key property
that all these entropies share is related to the concept of majorization [122]. Majorization
gives a partial order between probability vectors and it is defined as follows: for given
probability vectors $p$ and $q$ of length $N$ sorted in decreasing order, it is said that $p$ is
majorized by $q$, denoted as $p \prec q$, when

$$ \sum_{i=1}^{n} p_i \leq \sum_{i=1}^{n} q_i \ \text{ for all } n = 1, \ldots, N-1 \quad \text{and} \quad \sum_{i=1}^{N} p_i = \sum_{i=1}^{N} q_i . \tag{19} $$
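The partial-sum condition of Eq. (19) is easy to test numerically. The sketch below is one possible implementation; the function name and the tolerance parameter `tol` are assumptions introduced here to absorb floating-point rounding.

```python
from itertools import accumulate


def is_majorized(p, q, tol=1e-12):
    """Return True if p is majorized by q in the sense of Eq. (19)."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    # Sort in decreasing order and form the cumulative sums.
    p_sums = list(accumulate(sorted(p, reverse=True)))
    q_sums = list(accumulate(sorted(q, reverse=True)))
    if abs(p_sums[-1] - q_sums[-1]) > tol:  # total sums must coincide
        return False
    return all(a <= b + tol for a, b in zip(p_sums[:-1], q_sums[:-1]))


print(is_majorized([0.5, 0.3, 0.2], [0.6, 0.2, 0.2]))
print(is_majorized([0.5, 0.25, 0.25], [0.4, 0.4, 0.2]),
      is_majorized([0.4, 0.4, 0.2], [0.5, 0.25, 0.25]))
```

The second pair of vectors is incomparable: neither majorizes the other, which is why the relation is only a partial order.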

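In [67, 123], it has been shown that classical $(h, \phi)$-entropies are Schur-concave, that
is, they reverse the majorization order: if $p \prec q$, then $H_{(h,\phi)}(p) \geq H_{(h,\phi)}(q)$.
Many properties of Salicrú entropies can be proved using majorization, e.g. the lower and
upper bounds $0 \leq H_{(h,\phi)}(p) \leq h\left(N \phi(1/N)\right)$, attained by the deterministic and the
uniform probability vectors, respectively.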

On the other hand, it is quite natural to define quantum $(h, \phi)$-entropies by replacing
the probability vector by a density operator and the sum by the trace in Def. 2, as follows EH

Definition 3. Let us consider a quantum system described by a density operator $\rho$ acting
on an $N$-dimensional Hilbert space $\mathcal{H}$. The quantum $(h, \phi)$-entropies (under the same
assumptions for $h$ and $\phi$ as in Def. 2) are defined as follows

$$ H_{(h,\phi)}(\rho) = h\left( \operatorname{Tr} \phi(\rho) \right) . \tag{20} $$

As in the classical counterpart, the quantum $(h, \phi)$-entropies include as particular cases
the von Neumann ra, and quantum versions of the Rényi and Tsallis entropies. It can be shown
that if the probability vector $p$ is formed by the eigenvalues of $\rho$, then

$$ H_{(h,\phi)}(\rho) = H_{(h,\phi)}(p) . \tag{21} $$

In other words, quantum $(h, \phi)$-entropies are nothing more than classical $(h, \phi)$-entropies of
the probability vectors formed by eigenvalues of density operators.

Let us consider two density operators $\rho$ and $\sigma$, with $p$ and $q$ the vectors formed by their
eigenvalues sorted in decreasing order, respectively. Now, $\rho$ is majorized by $\sigma$, denoted
as $\rho \prec \sigma$, means that $p \prec q$ in the sense of Eq. (19). It can be shown that quantum
$(h, \phi)$-entropies are also Schur-concave IE


Let $p(E; \rho)$ be the probability vector whose components are given by the Born rule for
a rank-one POVM $E = \{E_i\}$ and state $\rho$, that is, $p_i(E; \rho) = \operatorname{Tr} \rho E_i$. An alternative
definition of quantum $(h, \phi)$-entropies, which is equivalent to Def. 3 but with more physical
meaning related to the probability of measurement, is the following |S7|:

Definition 4. Under the same assumptions as in Def. 2, the quantum $(h, \phi)$-entropies are also
defined as

$$ H_{(h,\phi)}(\rho) = \min_{E \in \mathbb{E}} H_{(h,\phi)}(p(E; \rho)) , \tag{22} $$

where $\mathbb{E}$ is the set of all rank-one POVMs.
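The minimization in Definition 4 can be explored numerically in a very restricted setting. The sketch below assumes a qubit state with eigenvalues $(\lambda, 1 - \lambda)$ and scans only projective measurements in real orthonormal bases rotated by an angle $\theta$ from the eigenbasis; this is a small subfamily of the rank-one POVMs, so it illustrates the definition rather than implementing it. The Shannon entropy is used as the $(h, \phi)$-entropy, and all names and the value of `lam` are hypothetical.

```python
import math


def shannon(p):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def outcome_probabilities(lam, theta):
    """Born-rule probabilities for the basis rotated by theta from the eigenbasis.

    The state is diag(lam, 1 - lam) and the first basis vector is
    cos(theta)|0> + sin(theta)|1>.
    """
    c2 = math.cos(theta) ** 2
    first = lam * c2 + (1.0 - lam) * (1.0 - c2)
    return (first, 1.0 - first)


lam = 0.8
thetas = [k * math.pi / 400 for k in range(101)]  # angles from 0 to pi/4
entropies = [shannon(outcome_probabilities(lam, t)) for t in thetas]
best = min(range(len(thetas)), key=entropies.__getitem__)
print(thetas[best], entropies[best], shannon((lam, 1.0 - lam)))
```

Within this family the minimum is reached at $\theta = 0$, that is, when the measurement is performed in the eigenbasis, and its value is the entropy of the eigenvalue vector, in agreement with Eq. (21).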

Further properties of classical and quantum $(h, \phi)$-entropies are given in jHI] (and
references therein).

B. Entropies and majorization in general probabilistic theories 

Now, we aim to extend the definition of $(h, \phi)$-entropies to more general probabilistic
theories. It is possible to do this in at least two different ways. First, one could start
with an atomic orthomodular lattice $\mathcal{L}$ defining an algebra of events. A frame in $\mathcal{L}$ will
be an orthogonal set of atoms $\{a_i\}_{i \in I}$ such that $\bigvee_{i \in I} a_i = \mathbf{1}$. Frames represent
maximal experiments. For example, in quantum mechanics, any orthonormal basis (or rank-one
PVM) is a frame. Thus, for each frame $F = \{a_i\}_{i \in I}$ and each state $\nu$ we have
$p_i = \nu(a_i)$. Then, $\{p_i\}$ defines a probability vector and this allows us to define the
$(h, \phi)$-entropies relative to that frame:

$$ H^{F}_{(h,\phi)}(\nu) = h\left( \sum_{i \in I} \phi(\nu(a_i)) \right) . \tag{23} $$

In order to give a definition independent of the frame, we have to take the minimum over
all possible frames:

Definition 5. Let us consider a state $\nu \in \mathcal{C}$. The general $(h, \phi)$-entropies (under the same
assumptions for $h$ and $\phi$ as in Def. 2) are defined as follows

$$ H_{(h,\phi)}(\nu) = \inf_{F \in \mathbb{F}} H^{F}_{(h,\phi)}(\nu) , \tag{24} $$

where $\mathbb{F}$ is the set of all frames. This is the canonical way in which entropies can be
defined in general probabilistic theories. We observe that this approach resembles Def. 4
for the quantum case. Measurement entropy given in |48l IM] is a particular case of this
approach. But it also includes other quantities, such as the Rényi and Tsallis entropies in
the case of general probabilistic theories. Notice that by taking the minimum over all
possible frames, the contextual structure of the probability measures involved is made explicit.

There is another possible way in which $(h, \phi)$-entropies in quite general probabilistic
theories can be defined: we will provide a generalization of Def. 3. For this task, we have
to define the notions of generalized spectrum and majorization. We restrict to arbitrary
(compact) convex sets of states in finite dimensions; for these spaces, each element can be
written as a convex combination of pure states (as is the case in quantum and classical
mechanics). In other words, for every state $\nu$ there exist pure states $\{\nu_i\}$ such that
$\nu$ can be written as

$$ \nu = \sum_i p_i \nu_i . \tag{25} $$

But this decomposition is not, in general, unique. For instance, the maximally mixed state
in quantum mechanics has infinitely many decompositions even in terms of orthogonal pure
states. Notwithstanding, the probability vectors defined by the coefficients of these
decompositions are all the same. Notice that this uniqueness property need not hold for
arbitrary models, as we will discuss below.

We now introduce our notion of generalized spectrum, inspired by the Schrödinger mixture
theorem (see e.g. PH Th.8.2]). Using this theorem, it can be shown that the probability
vector formed by the coefficients of any convex pure decomposition of a quantum state is
majorized by the one formed by its eigenvalues. In other words, the spectrum of a quantum
state has the distinctive property of being the majorant of all possible probability vectors
originating from convex decompositions in terms of pure states. We will abstract this property,
and use it for defining a generalized spectrum for generalized states as follows. Given a
probabilistic model described by a compact convex set, let $M_\nu$ be the set of probability
vectors of all possible convex decompositions of a state $\nu$ in terms of pure states, that is

$$ M_\nu := \Big\{ p(\nu) = \{p_i\} \ \Big| \ \nu = \sum_i p_i \nu_i \ \text{for pure } \nu_i \Big\} . \tag{26} $$

Then, we propose the following

Definition 6. Given a state $\nu$, if the majorant of the set $M_\nu$ (partially ordered by
majorization) exists, it is called the spectrum of $\nu$ and it is denoted by $\bar{p}(\nu)$.

Accordingly, the corresponding generalized spectral decomposition is

$$ \nu = \sum_i \bar{p}_i \bar{\nu}_i . \tag{27} $$

Notice that our definition reduces to the usual one for classical theories (where the sets
of states are simplices) and also in quantum mechanics. In the former case, equivalence can
be checked easily, because there is only one convex decomposition in terms of pure states.
In the latter case, as noted above, equivalence is a consequence of the Schrödinger mixture
theorem. Notice, however, that for a general statistical theory described by a compact convex
set, it could be that the supremum $\bar{p}(\nu)$ does not exist for all possible states.
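The search for a majorant can be sketched computationally under a strong simplifying assumption: instead of the full set of probability vectors of all pure-state decompositions, which is in general a continuum, only a finite list of candidate vectors is examined. Vectors of different lengths are compared after padding with zeros, which is a convention adopted here, and the function names and numerical values are hypothetical.

```python
from itertools import accumulate


def partial_sums(p, n):
    """Cumulative sums of p sorted in decreasing order and zero-padded to length n."""
    padded = sorted(p, reverse=True) + [0.0] * (n - len(p))
    return list(accumulate(padded))


def majorizes(q, p, tol=1e-12):
    """True if q majorizes p (both assumed to be normalized probability vectors)."""
    n = max(len(p), len(q))
    return all(a <= b + tol
               for a, b in zip(partial_sums(p, n), partial_sums(q, n)))


def majorant(candidates):
    """Return a candidate that majorizes all the others, or None if there is none."""
    for q in candidates:
        if all(majorizes(q, p) for p in candidates):
            return sorted(q, reverse=True)
    return None


# Coefficient vectors of three decompositions of the same state (illustrative values).
print(majorant([[0.5, 0.5], [0.7, 0.3], [0.4, 0.3, 0.3]]))
# A family with two incomparable vectors has no majorant.
print(majorant([[0.5, 0.25, 0.25], [0.4, 0.4, 0.2]]))
```

When the function returns `None`, the situation corresponds to the case mentioned above in which the spectrum of Definition 6 does not exist, at least within the candidate list considered.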

We observe that an alternative definition of generalized spectrum has been recently
introduced by Barnum et al. in [125]. The authors define the spectrum of a state as the unique
(up to permutations) convex decomposition into perfectly distinguishable pure states.
Distinguishability has the following operational meaning: a set of states $\{\nu_i\}$ is perfectly
distinguishable if there is a measurement $\{E_j\}$ such that $E_j(\nu_i) = \delta_{ij}$. It is important
to remark that their definition of spectrum cannot be used in arbitrary state spaces. This is
due to the fact that for certain spaces, the decomposition of a state into perfectly
distinguishable pure states can fail to be unique, and different decompositions can yield
different probability vectors. Spaces for which a decomposition into perfectly distinguishable
states always exists are said to satisfy the weak spectrality axiom (WS-spaces). In spaces
satisfying strong spectrality (S-spaces), the probability vectors of the convex pure
decomposition into perfectly distinguishable states are unique (up to permutations). It can be
shown that there are WS-spaces which are not S-spaces, and then the definition of spectrality
presented in [125] does not work in those cases. The definition presented in [125] and ours
yield the same result for classical and quantum state spaces. But they are expected to be
non-equivalent in the general case. There could be spaces for which certain states admit
different probability vectors for distinct decompositions into perfectly distinguishable pure
states, but for which it is still possible to find a maximum according to our definition (see
for example Fig. 1). It is an interesting open question to determine under which conditions
both definitions are equivalent, and especially, the range of validity of Def. 6. This last task
can be rephrased as follows: which are the spaces for which a generalized version of the
Schrödinger mixture theorem is valid? We will not deal with this problem here; we will only
restrict ourselves to showing how our definition can be used to define generalized
majorization, functions over states and, in particular, entropic measures.

Figure 1: The generalized spectral decomposition, Eq. (27), can be computed in a variety of
probabilistic theories. (a) When the convex set is a simplex, the decomposition in terms of pure
states is unique and so it determines the spectrum of $\nu$. In the triangle above, $\nu$ can be
written in a unique way as a mixture of $\nu_1$, $\nu_2$ and $\nu_3$. (b) For the state $\nu$ of a qubit,
the spectrum is given by the eigendecomposition of its density matrix in terms of the orthogonal
pure states $\nu_1$ and $\nu_2$. The same happens for any other quantum mechanical model. (c) For a
general probabilistic theory, there are, a priori, many decompositions of a state in terms of
pure ones, and we have to look for the majorant one. For example, for the non-regular polygon
with four vertices the state in the barycenter is
$\nu = \tfrac{1}{2}\nu_1 + \tfrac{1}{2}\nu_2 = x\nu'_1 + (1 - x)\nu'_2$ with $x > \tfrac{1}{2}$. The second set of
coefficients majorizes the first one, so $\bar{p}(\nu) = \{x, 1 - x\}$ constitutes the spectrum of
$\nu$. Note, however, that in both decompositions the pure states are perfectly distinguishable.

Def. 6 can be used to introduce naturally the concept of generalized majorization as
follows.

Definition 7. Given two states $\mu$ and $\nu$, one has that $\mu$ is majorized by $\nu$, denoted by
$\mu \prec \nu$, if and only if

$$ \bar{p}(\mu) \prec \bar{p}(\nu) , \tag{28} $$

where $\bar{p}(\mu)$ and $\bar{p}(\nu)$ are the corresponding generalized spectra from Def. 6.

Moreover, our definition of generalized spectrum can also be used to evaluate a function
$\phi$ on a generalized state as follows. For any possible mixture $\{p_i, \nu_i\}$ of $\nu$, we define
the application of a functional $\phi$ to the state given the mixture as

$$ \phi(\nu \,|\, \{p_i, \nu_i\}) := \sum_i \phi(p_i)\, \nu_i . \tag{29} $$

In particular, we are interested in the mixture $\{\bar{p}_i, \bar{\nu}_i\}$, which leads to the definition

$$ \phi(\nu) := \phi(\nu \,|\, \{\bar{p}_i, \bar{\nu}_i\}) . \tag{30} $$

We have seen in Section IV that the partial trace of the quantum formalism can be extended
to the general setting by using the normalization functional $u_{\mathcal{C}}$. This allows us to define
alternative generalized $(h, \phi)$-entropies.


Definition 8. Under the same assumptions as in Def. 2, we define the $(h, \phi)$-entropies

$$ H_{(h,\phi)}(\nu) = h\left( u_{\mathcal{C}}(\phi(\nu)) \right) . \tag{31} $$

In other words, these generalized entropies are equal to the classical ones evaluated on
the probability vector $\bar{p}(\nu)$, that is

$$ H_{(h,\phi)}(\nu) = H_{(h,\phi)}(\bar{p}(\nu)) . $$

In principle, it can be shown that all the properties of classical (and quantum)
$(h, \phi)$-entropies that are based on majorization and Schur-concavity hold in this general case
(further properties are under investigation).


VI. CONCLUSIONS


In this paper we have discussed the formal aspects that show that quantum information
theory arises as a non-Kolmogorovian version of Shannon’s information theory. In
other words, when the probabilities involved are measures over projection lattices of Hilbert
spaces, we obtain quantum information theory. On the other hand, when the algebra of
events is a Boolean one, we recover Shannon’s formalism. This structure is reencountered
in the generalized setting, where many informational notions, such as correlations between
different parties and information protocols, can be described. In this way, quantum and
classical information theories appear as particular cases of a generalized non-Kolmogorovian
probabilistic calculus. In particular, we have shown that the Salicrú entropies can be
defined in the non-Kolmogorovian setting, extending the catalogue of extant entropic measures
available in the literature. In doing so, we have also introduced a definition of spectrum
for generalized measures which relies on an essential property derived from the Schrödinger
mixture theorem, and which allows one to introduce a new notion of generalized majorization
and functions of states (such as the generalized entropies introduced in Def. 8).